Fujisaki model based F0 contours in Vietnamese TTS
Abstract
This paper presents preliminary work towards integrating the Fujisaki model into the VnVoice Vietnamese TTS system, using a set of rules to control the F0 contour. A speech corpus consisting of 20 sentences was compiled. Each sentence can have various meanings depending on the tone associated with a monosyllabic keyword that it contains. The corpus, with a total of 46 sentences, was recorded by a female speaker whose voice had also been used in the speech corpus for VnVoice, and was labeled at the syllabic level. Tone contrast perception results and naturalness comparisons show that the Fujisaki model works well in modeling the F0 contours of Vietnamese tones.
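For readers unfamiliar with the model named in the abstract, the following is a minimal sketch of the standard Fujisaki formulation, in which log F0 is a base value plus superposed phrase and accent (or tone) components. It is not the VnVoice rule set or the authors' implementation. The function names, the command tuple layout, and the default constants (`alpha`, `beta`, `gamma`) are illustrative assumptions chosen for this example.

```python
import math


def phrase_response(t, alpha=2.0):
    """Impulse response of the phrase control mechanism, Gp(t)."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=20.0, gamma=0.9):
    """Step response of the accent/tone control mechanism, Ga(t), capped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, fb, phrase_cmds, accent_cmds, alpha=2.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at time t (seconds).

    fb          -- base frequency in Hz
    phrase_cmds -- list of (onset_time, amplitude) impulses (hypothetical layout)
    accent_cmds -- list of (onset_time, offset_time, amplitude) steps (hypothetical layout)
    alpha, beta, gamma -- illustrative defaults, not values from the paper
    """
    log_f0 = math.log(fb)
    for t0, ap in phrase_cmds:
        log_f0 += ap * phrase_response(t - t0, alpha)
    for t1, t2, aa in accent_cmds:
        log_f0 += aa * (
            accent_response(t - t1, beta, gamma)
            - accent_response(t - t2, beta, gamma)
        )
    return math.exp(log_f0)


# Example: one phrase command at 0 s and one accent/tone command from 0.2 s to 0.8 s.
contour = [
    fujisaki_f0(0.01 * i, 200.0, [(0.0, 0.5)], [(0.2, 0.8, 0.3)])
    for i in range(100)
]
```

A rule-based system in this style would decide command onsets, offsets, and amplitudes from the text; here they are simply hard-coded to show the shape of the computation.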
Similar resources
A novel approach to the fully automatic extraction of Fujisaki model parameters
The generation of natural-sounding F0 contours in TTS is important for the intelligibility and perceived naturalness of synthetic speech. In earlier works the author developed a linguistically motivated model of German intonation based on the quantitative Fujisaki model of the production process of F0. The extraction of parameters for this model from the extracted F0 contour, however, poses p...
Towards the automatic extraction of Fujisaki model parameters for Mandarin
The generation of natural-sounding F0 contours in TTS enhances the intelligibility and perceived naturalness of synthetic speech. In earlier works the first author developed a linguistically motivated model of German intonation based on the quantitative Fujisaki model of the production process of F0, and an automatic procedure for extracting the parameters from the F0 contour which, however, ...
An Overview of Prosodic Modelling for Croatian Speech Synthesis
To include prosody in text-to-speech (TTS) systems, prosodic knowledge needs to be acquired, represented and incorporated. The two main prosodic features for TTS modelling are duration and F0 contour. There are various approaches to modelling these features, and they can be categorized into three main groups: rule-based, statistical and minimalistic. Some...
A targets-based superpositional model of fundamental frequency contours applied to HMM-based speech synthesis
The superpositional model of fundamental frequency (F0) contours suggested by the Fujisaki model represents the F0 movements of speech well while keeping a clear relation to the linguistic information of utterances. HMM-based speech synthesis is therefore expected to improve by exploiting the merits of the superpositional model. In this paper, a targets-based superpositional model is proposed in the light of ...
Statistical Approach to Fujisaki-model Parameter Estimation from Speech Signals and Its Quantitative Evaluation
We have previously proposed a statistical model of speech F0 contours, which is based on the discrete-time version of the Fujisaki model. One advantage of this model is that it allows us to introduce statistical methods to learn the Fujisaki-model parameters from speech F0 contours. This paper proposes several modifications to our previous model and parameter inference algorithm, and quantitati...